Abstract: De-duplication is the process of identifying all groups of records within a data set that refer to the same real-world entity. Data gathered from different sources may have quality issues. The idea is to detect duplicates using windowing and blocking techniques. De-duplication also provides additional information about the similarities between two entities. This paper focuses primarily on the accurate identification of duplicates in a database by applying windowing and blocking. The goal is to achieve better accuracy and good efficiency, and to reduce the false positive rate, all based on the estimated similarities of records.

Keywords: Access control, big data, cloud computing, data deduplication, proxy re-encryption.